 vector regression


TSVR+: Twin support vector regression with privileged information

arXiv.org Artificial Intelligence

In machine learning, data may contain additional attributes, known as privileged information (PI). The main purpose of PI is to assist in training the model, so that the acquired knowledge can then be used to make predictions for unseen samples. Support vector regression (SVR) is an effective regression model; however, it has a low learning speed because it solves a convex quadratic programming problem (QP) subject to a pair of constraints. In contrast, twin support vector regression (TSVR) is more efficient than SVR, as it solves two QPs, each subject to one set of constraints. However, TSVR and its variants are trained only on regular features and do not use privileged features. To fill this gap, we combine TSVR with learning using privileged information (LUPI) and propose a novel approach called twin support vector regression with privileged information (TSVR+). The regularization terms in the proposed TSVR+ capture the essence of statistical learning theory and implement the structural risk minimization principle. We use the successive overrelaxation (SOR) technique to solve the optimization problem of the proposed TSVR+, which enhances the training efficiency. To the best of our knowledge, the integration of the LUPI concept into twin variants of regression models is novel. Numerical experiments on UCI, stock, and time series data demonstrate the superiority of the proposed model.
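The SOR technique mentioned in this abstract can be illustrated on a generic box-constrained QP of the kind that arises as an SVR-type dual. The sketch below is not the TSVR+ dual itself: the problem form (minimize 0.5 a'Qa - b'a subject to 0 <= a_i <= upper), the function name `projected_sor`, and all parameter names are assumptions made for illustration.

```python
def projected_sor(Q, b, upper, omega=1.2, iterations=200):
    """Approximately minimize 0.5*a'Qa - b'a subject to 0 <= a_i <= upper.

    Q is a symmetric positive definite matrix given as a list of rows,
    and omega is the relaxation factor (0 < omega < 2).
    Hypothetical helper; not the solver from the TSVR+ paper.
    """
    n = len(b)
    a = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            # Gradient component i, using the most recently updated entries of a.
            grad_i = sum(Q[i][j] * a[j] for j in range(n)) - b[i]
            # Over-relaxed coordinate step, then projection onto [0, upper].
            a[i] = min(upper, max(0.0, a[i] - omega * grad_i / Q[i][i]))
    return a
```

Each sweep updates one coordinate at a time and never needs a matrix factorization, which is why SOR-style solvers are often used to speed up training on larger QPs.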


Robust Brain Age Estimation via Regression Models and MRI-derived Features

arXiv.org Artificial Intelligence

Biological brain age is a crucial biomarker in the assessment of neurological disorders and in understanding the morphological changes that occur during aging. Various machine learning models have been proposed for estimating brain age from Magnetic Resonance Imaging (MRI) of healthy controls. However, developing a robust brain age estimation (BAE) framework has been challenging because of the difficulty of selecting appropriate MRI-derived features and the high cost of MRI acquisition. In this study, we present a novel BAE framework using the Open Big Healthy Brain (OpenBHB) dataset, a new multi-site, publicly available benchmark dataset that includes region-wise feature metrics derived from T1-weighted (T1-w) brain MRI scans of 3965 healthy controls aged between 6 and 86 years. Our approach integrates three different MRI-derived region-wise features and different regression models, yielding a Mean Absolute Error (MAE) of 3.25 years and demonstrating the framework's robustness. We also analyze the model's performance separately on male and female healthy test groups. The proposed BAE framework provides a new approach for estimating brain age, which has important implications for the understanding of neurological disorders and age-related brain changes.
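The group-wise evaluation described here amounts to computing the MAE separately for each test group. A minimal sketch follows; the function name `mae_by_group` and the group labels are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def mae_by_group(true_ages, predicted_ages, groups):
    """Return {group: mean absolute error} for paired lists of ages.

    Hypothetical helper for illustration; the labels in `groups`
    can be any hashable values, e.g. "F" and "M".
    """
    errors = defaultdict(list)
    for truth, pred, group in zip(true_ages, predicted_ages, groups):
        errors[group].append(abs(truth - pred))
    return {group: sum(vals) / len(vals) for group, vals in errors.items()}
```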


Traffic Congestion Prediction Using Machine Learning Techniques

arXiv.org Artificial Intelligence

Predicting traffic congestion can play a crucial role in future decision-making. Although many studies have addressed congestion, most do not cover all the important factors (e.g., weather conditions). We propose a model that predicts traffic congestion from the day, the time, and several weather variables (e.g., temperature, humidity). We evaluated the model on traffic data from New Delhi. With this model, congestion on a road can be predicted one week ahead with an average RMSE of 1.12. The model can therefore be used to take preventive measures in advance.
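For reference, the RMSE reported here is the square root of the mean squared difference between observed and predicted values. A minimal sketch, assuming both series are plain Python lists of equal length (the function name `rmse` is illustrative):

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

Because errors are squared before averaging, RMSE penalizes a few large misses more heavily than many small ones.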


Modeling Weather-induced Home Insurance Risks with Support Vector Machine Regression

arXiv.org Machine Learning

The insurance industry is one of the sectors most vulnerable to climate change. Assessing the future number of claims and incurred losses is critical for disaster preparedness and risk management. In this project, we study the effect of precipitation on the joint dynamics of weather-induced home insurance claims and losses. We discuss the utility and limitations of machine learning procedures such as Support Vector Machines and Artificial Neural Networks in forecasting future claim dynamics and evaluating the associated uncertainties. We illustrate our approach by applying it to attribution analysis and forecasting of weather-induced home insurance claims in a mid-sized city in the Canadian Prairies.


Travel time prediction for congested freeways with a dynamic linear model

arXiv.org Machine Learning

Accurate prediction of travel time is an essential feature to support Intelligent Transportation Systems (ITS). The non-linearity of traffic states, however, makes this prediction a challenging task. Here we propose to use dynamic linear models (DLMs) to approximate the non-linear traffic states. Unlike a static linear regression model, the DLMs assume that their parameters are changing across time. We design a DLM with model parameters defined at each time unit to describe the spatio-temporal characteristics of time-series traffic data. Based on our DLM and its model parameters analytically trained using historical data, we suggest an optimal linear predictor in the minimum mean square error (MMSE) sense. We compare our prediction accuracy of travel time for freeways in California (I210-E and I5-S) under highly congested traffic conditions with those of other methods: the instantaneous travel time, k-nearest neighbor, support vector regression, and artificial neural network. We show significant improvements in the accuracy, especially for short-term prediction.
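The contrast between a static regression and a DLM can be sketched with the simplest time-varying model: a single regression coefficient that follows a random walk, tracked with a scalar Kalman filter (the linear MMSE estimator for such a model). This is a generic illustration, not the paper's DLM; the model form, the function name `dlm_filter`, and the variance parameters are all assumptions.

```python
def dlm_filter(xs, ys, obs_var=1.0, state_var=0.01, theta0=0.0, p0=1e3):
    """Track a time-varying coefficient theta_t in y_t = x_t * theta_t + noise.

    Assumed state model: theta_t = theta_{t-1} + w_t (random walk).
    Returns the filtered estimate of theta after each observation.
    """
    theta, p = theta0, p0
    estimates = []
    for x, y in zip(xs, ys):
        # Predict: the random walk keeps the mean and inflates the variance.
        p = p + state_var
        # Update: correct the estimate using the one-step prediction error.
        innovation = y - x * theta
        s = x * p * x + obs_var   # variance of the prediction error
        k = p * x / s             # Kalman gain
        theta = theta + k * innovation
        p = (1.0 - k * x) * p
        estimates.append(theta)
    return estimates
```

A static linear regression corresponds to `state_var=0`, where the coefficient is never allowed to drift; a positive `state_var` lets the estimate follow changes in the underlying relationship.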


Data-Driven Prediction Model of Components Shift during Reflow Process in Surface Mount Technology

arXiv.org Machine Learning

In surface mount technology (SMT), components mounted on soldered pads tend to move during the reflow process. This capability is known as self-alignment and results from the fluid dynamic behaviour of molten solder paste. It is critical in SMT because inaccurate self-alignment causes defects such as overhanging, tombstoning, etc., while accurate self-alignment can enable components to be perfectly self-assembled on or near the desired position. The aim of this study is to develop a machine learning model that predicts component movement during reflow in the x and y directions as well as in rotation. Our study is composed of two steps: (1) experimental data are studied to reveal the relationships between self-alignment and various factors, including component geometry, pad geometry, etc.; (2) machine learning models, namely support vector regression (SVR), neural network (NN), and random forest regression (RFR), are applied to predict the distance and direction of component shift. As a result, RFR can predict component shift with an average fitness of 99%, 99%, and 96% and an average prediction error of 13.47 µm, 12.02 µm, and 1.52 degrees in the x, y, and rotational directions, respectively. This enhancement makes it possible in the future to optimize the parameters of the pick-and-place machine to control the best placement location and minimize the intrinsic defects caused by self-alignment.


Identifying Real Estate Opportunities using Machine Learning

arXiv.org Machine Learning

The real estate market is exposed to many fluctuations in prices because of existing correlations with many variables, some of which cannot be controlled or might even be unknown. Housing prices can increase rapidly (or, in some cases, drop very fast), yet the numerous online listings where houses are sold or rented are unlikely to be updated that often. In some cases, individuals interested in selling a house (or apartment) might include it in an online listing and forget to update the price. In other cases, individuals might deliberately set a price below the market price in order to sell the home faster, for various reasons. In this paper we aim to develop a machine learning application that identifies opportunities in the real estate market in real time, i.e., houses that are listed with a price substantially below the market price. This program can be useful for investors interested in the housing market. The application is formulated as a regression problem that tries to estimate the market price of a house given features retrieved from public online listings. To build this application, we performed a feature engineering stage in order to discover relevant features that allow a high predictive performance to be attained. Several machine learning algorithms have been tested, including regression trees, k-NN, and neural networks, and the advantages and drawbacks of each are identified. The real estate market is rapidly evolving. A recent report published by MSCI estimates the size of the professionally managed real estate investment market at $8.5 trillion in 2017, an increase of $1.1 trillion over the previous year [1]. Of course, the real market size is expected to be much larger when counting assets that are not professionally managed or are not objects of investment.
When viewed from a macroeconomic perspective, many aspects significantly drive the behavior of this market, such as demographics, interest rates, government regulation and, in short, global economic health. However, looking at the market evolution from a global perspective turns out to be too simplistic. Although the market at a global scale is very tightly correlated, many aspects influence the behavior of markets at a local scale, such as political instability or the emergence of highly demanded "hot spots" that can shift rapidly.


Tensor Decompositions for Modeling Inverse Dynamics

arXiv.org Machine Learning

Modeling inverse dynamics is crucial for accurate feedforward robot control. The model computes the joint torques necessary to perform a desired movement. The highly non-linear inverse function of the dynamical system can be approximated using regression techniques. As a regression method, we propose a tensor decomposition model that exploits the inherent three-way interaction of positions × velocities × accelerations. Most work in tensor factorization has addressed the decomposition of dense tensors. In this paper, we build upon the decomposition of sparse tensors, which have only a small number of nonzero entries. The decomposition of sparse tensors has successfully been used in relational learning, e.g., the modeling of large knowledge graphs. Recently, the approach has been extended to multi-class classification with discrete input variables. Representing the data in high-dimensional sparse tensors enables the approximation of complex, highly non-linear functions. In this paper we show how the decomposition of sparse tensors can be applied to regression problems. Furthermore, we extend the method to continuous inputs by learning a mapping from the continuous inputs to the latent representations of the tensor decomposition, using basis functions. We evaluate our proposed model on a dataset with trajectories from a seven-degrees-of-freedom SARCOS robot arm. Our experimental results show superior performance of the proposed functional tensor model compared to challenging state-of-the-art methods.


Characterization of the equivalence of robustification and regularization in linear and matrix regression

arXiv.org Machine Learning

The notion of developing statistical methods in machine learning which are robust to adversarial perturbations in the underlying data has been the subject of increasing interest in recent years. A common feature of this work is that the adversarial robustification often corresponds exactly to regularization methods which appear as a loss function plus a penalty. In this paper we deepen and extend the understanding of the connection between robustification and regularization (as achieved by penalization) in regression problems. Specifically, (a) in the context of linear regression, we characterize precisely under which conditions on the model of uncertainty used and on the loss function penalties robustification and regularization are equivalent, and (b) we extend the characterization of robustification and regularization to matrix regression problems (matrix completion and Principal Component Analysis).
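One well-known instance of this equivalence may help fix ideas. The notation below ($X$ for the data matrix, $y$ for the response, $\beta$ for the coefficients, $\Delta$ for the adversarial perturbation, $\lambda$ for its budget) is assumed here for illustration and does not appear in the abstract. For the unsquared $\ell_2$ loss with a Frobenius-norm uncertainty set:

```latex
\min_{\beta} \; \max_{\Delta \,:\, \|\Delta\|_F \le \lambda}
  \|y - (X + \Delta)\beta\|_2
\;=\;
\min_{\beta} \; \|y - X\beta\|_2 + \lambda \|\beta\|_2
```

The inner maximum follows from the triangle inequality, $\|y - X\beta - \Delta\beta\|_2 \le \|y - X\beta\|_2 + \|\Delta\|_F \|\beta\|_2$, with equality attained by a rank-one $\Delta$ aligned with the residual. The robust problem is thus a loss plus a penalty, which is the kind of correspondence the paper characterizes in general.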


Optimal $\gamma$ and $C$ for $\epsilon$-Support Vector Regression with RBF Kernels

arXiv.org Machine Learning

The objective of this study is to investigate the efficient determination of $C$ and $\gamma$ for Support Vector Regression with an RBF or Mahalanobis kernel, based on numerical and statistical considerations. The analysis indicates a connection between $C$ and the kernel and demonstrates that the deviation of the geometric distance between neighbouring observations in the mapped space affects the prediction accuracy of $\epsilon$-SVR. We determine the range of $\gamma$ and $C$ and propose a method for choosing their best values.
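To see why $\gamma$ interacts with distances between neighbouring observations, recall the standard RBF kernel $k(x, z) = \exp(-\gamma \|x - z\|^2)$. The sketch below evaluates it in plain Python; the function name `rbf_kernel` is illustrative, and the code does not implement the paper's selection method.

```python
import math


def rbf_kernel(x, z, gamma):
    """Standard RBF kernel exp(-gamma * ||x - z||^2) for equal-length vectors."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

For a fixed pair of points, a larger `gamma` drives the kernel value toward 0, so even nearby observations look dissimilar; a smaller `gamma` drives it toward 1, so distant observations look alike. A useful range for `gamma` therefore depends on the typical distances in the data.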